RELIABILITY OF LABORATORY TESTS OF VSTOL AND OTHER LONG-DURATION NOISES
K. D. Kryter, D. J. Peeler, M. E. Dobbs and J. S. Lukas 
Stanford Research Institute, Menlo Park, California 94025 

INTRODUCTION 

The specification of maximum limits allowable for noise from aircraft and
the noise limits to be allowed in communities are based on methods of noise
measurement that take into account the spectral and temporal characteristics
of the noise: the so-called Effective Perceived Noise Level (EPNL), expressed
in units such as EPNdB, EdBA, or EdBD. These units of noise measurement have
been developed largely through subjective judgment tests conducted in
laboratories.

One of the perplexing problems with these noise measurement evaluation
procedures is the inconsistency in the results of some laboratory tests when
the noise stimuli to be judged varied significantly with respect to both
spectral content and duration or temporal pattern. The difficulty, however,
may be more related to the experimental procedures followed in the laboratory
than to how predictive the methods would be for reactions to the noises in
"real life". For example, in recent tests (Ref. 1) conducted to evaluate the
effectiveness of nacelle-noise reduction (Refs. 2 and 3), it was found that
the units of noise measurement that take the duration of the noises into
account do no better in predicting the subjective judgments than do those
units of measurement that reflect only the maximum level reached during the
noise occurrence. They may even do slightly worse, because of the greater
unreliability introduced by the increased number of physical measures required
for obtaining effective values. This would be contrary to other test results
(Ref. 4) and to the common observation that the longer the duration of a
noise, the more objectionable it is. Does this mean that noise measurements,
at least for some classes of noises, should not utilize the duration
information? The answer is probably no, because the noises in these tests were
all of about equal duration and, accordingly, the maximum intensity level and
spectrum of each noise solely determined its relative judged noisiness.

Inasmuch as decisions regarding the use of particular methods of noise
measurement, and the modification of these methods and their standardization
on an industry-government-wide basis, depend to a large extent on laboratory
test results, it is important to continue to verify and upgrade the methods
used in these tests. Also, the noise of VSTOL aircraft represents a relatively
new type of aircraft noise that differs from the noise of present-day
fixed-wing aircraft with regard to both spectral content and duration.
Accordingly, tests were conducted to study: (1) the judged annoyance effect of
VSTOL noise in comparison with other present-day noises; (2) the reliability
of the research methods used in these judgment tests; and (3) the relative
accuracy of various older units of noise measurement, and of two newer units
recently proposed by S. S. Stevens (Ref. 5), in predicting the subjective
perceived noisiness or unacceptableness of aircraft or other complex noises.

PROCEDURE 

Acoustic Environment. All tests were conducted in an anechoic chamber
which had 21-inch long fiberglass wedges on all six surfaces (see Fig. 1).
Measured from the tips of the wedges, the internal dimensions of the anechoic
chamber were 8.5 by 17.75 by 8 feet. The noises to be judged were presented
via two Altec-Lansing A7-500 speaker systems, each driven by an 80 watt
McIntosh power amplifier. Conventional playback circuitry was employed, with
the exception of artificial quieting of the system noise between stimulus
presentations and the use of an equalization network designed to provide as
flat as possible a frequency response at the listener positions within the
room. A block diagram, with manufacturer's name and model number of the
commercial equipment used, is provided in Fig. 2.



FIGURE 1. Anechoic chamber and location of subjects' chairs and loudspeakers
(Altec-Lansing A7-500 speaker systems).

Each speaker system was directed at four subjects seated in an arc of
radius of 8-1/2 feet. The chord of each arc was approximately 5 feet. The
sound pressure level of octave bands of noise with center frequencies ranging
from 63 to 8000 Hz varied by less than ± 2-1/2 dB at any listener position.
A low-pass filter with 3 dB downpoint at 8000 Hz was used to minimize tape
hiss.

Physical Analysis. Physical measures of noises were computed from
one-third octave band sound pressure levels sampled and averaged over 1/2
second time intervals. A General Radio Type 1921 Real-Time Analyzer was used
to produce, each 1/2 second, sound pressure level measurements in 24 one-third
octave bands covering the frequency range 50 to 10,000 Hz. These data were
recorded and processed in digital form. The end results of the analysis
include the time-histories of sound pressure levels in each of the 24 bands
and the so-called maximum (Max PNL) and effective (EPNL) levels of various
weighted measures: dBA, dBC, dBD2, dBE, PLdB, PNdB, PNdBM, and PNdB and PNdBM
corrected for tonal content by two procedures, designated by the subscripts
t1 and t2. These units and the related frequency weightings and calculation
procedures are given in detail in Refs. 3, 4, and 5 and are summarized in
Table 1 and Figure 3.

Noise Stimuli. The various noise stimuli used in the judgment tests are
described in Table 2. It is seen in Table 2 that the so-called VSTOL noises
were actually simulated to have the spectra and approximate durations believed
to be typical for such aircraft. All the noises were recorded onto a master
tape with the same dBD2 peak level. The relative intensity levels of the test
items were appropriately varied by means of an attenuator during re-recording
onto test tapes from the master tape. These test tapes were then played via
loudspeakers to the listeners in the anechoic chamber. The equipment used for
the making of the test tapes is shown in Figure 4.
Table 1

UNITS OF PHYSICAL NOISE MEASUREMENT

I    dBA, dBD2, dBC, and dBE are sound level meter readings, with specified
     weightings (see Fig. 3) and meter action set on "slow".

     A and C weighting (dBA, dBC) and "slow" meter action are defined
     in Ref. 6. D2 weighting (dBD2), and PNdB and PNdBM with and
     without pure-tone corrections (t1 and t2), are defined in Ref. 4.

     PLdB and E weighting (dBE) are defined in Ref. 5.

II   PNL is the level, for each of the above units, present in each
     successive half-second interval of a noise occurrence.

     Max PNL is the maximum level reached on a sound level meter over all
     weighted frequencies, or the maximum PNdB, PNdBM (with t1 or t2), and
     PLdB level reached during successive half-second intervals of a noise
     occurrence.

III  EPNL is taken as the integration, on a 10 log10 basis, of the
     half-second PNL values present between the 10 dB downpoints from the
     half-second interval in which the Max PNL occurred.
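
The EPNL definition in item III of Table 1 can be illustrated with a short Python sketch. This is only an illustration under stated assumptions, not the authors' program: the function name `effective_level` is hypothetical, the input is assumed to be a list of half-second levels already expressed in one of the units of Table 1, and no normalization to a reference duration is applied because Table 1 does not state one.

```python
import math


def effective_level(half_second_levels_db, downpoint_db=10.0):
    """Energy-sum the half-second levels lying between the 10 dB downpoints.

    Assumption: the downpoints are found by walking outward from the
    half-second interval holding the maximum level until the level first
    drops more than `downpoint_db` below that maximum.
    """
    levels = list(half_second_levels_db)
    if not levels:
        raise ValueError("at least one half-second level is required")

    peak_index = max(range(len(levels)), key=lambda i: levels[i])
    threshold = levels[peak_index] - downpoint_db

    # Walk left and right from the peak while still above the downpoint.
    start = peak_index
    while start > 0 and levels[start - 1] >= threshold:
        start -= 1
    end = peak_index
    while end < len(levels) - 1 and levels[end + 1] >= threshold:
        end += 1

    # Integration "on a 10 log10 basis": sum the energies, then go back to dB.
    energy = sum(10 ** (level / 10) for level in levels[start:end + 1])
    return 10 * math.log10(energy)


# Two half-second intervals at the same level add 3 dB to the maximum.
print(round(effective_level([80.0, 80.0]), 2))
```

A longer noise with the same Max PNL therefore receives a higher effective level, which is the duration effect the effective units are meant to capture.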




FIGURE 3. Frequency weightings applied to overall sound level measurements
(see Refs. 4, 5, and 6). Ordinate: relative response. Abscissa: frequency,
20 to 10,000 Hz.
Table 2. NOISES USED IN PAIRED-COMPARISON AND MAGNITUDE ESTIMATION TESTS

Recordings furnished by NASA Langley Research Center.

Both tape recorders aligned with Ampex alignment tape number 01-31321-01 for equal VU output (play and record).
After alignment, exact level dub is obtained with attenuator and amplifier settings as shown.

FIGURE 4. BLOCK DIAGRAM FOR DUBBING STIMULUS PRESENTATION TAPES

Paired Comparison Test. Each of the aircraft noises and the room
air-conditioner noise was paired with each of the two standards to form a pair
of noises. The subjects were asked (see Appendix A) to judge which noise in
each pair they considered the more unacceptable, bothersome, or annoying. The
standard pink noise (the output of an electronic random noise generator
shaped to have a low-frequency roll-off below 63 Hz of 3 dB per octave and a
high-frequency roll-off above 500 Hz of 6 dB per octave) and the comparison
noises were presented at the levels indicated in Table 2. Each comparison
noise was paired at each indicated level twice with each of the standards,
once occurring first in the pair and once second in the pair. The percentage
of subjects who judged the comparison noise to be the more unacceptable, at
each of its levels and in each order of presentation (when preceding and when
following the standard noise in a pair), was plotted against the sound
pressure level of the comparison noise. The level, as determined from the
resulting curve, at which 50 percent of the subjects would indicate the
comparison noise was the more unacceptable (or 50 percent would indicate the
standard to be the more unacceptable) was taken as the level of the noise, as
measured by a given unit of physical sound measurement, that provided
subjective equality with the standard. The values found for the two orders of
presentation were averaged to provide an answer presumably free of the "time"
error often present in such judgments due to the order in which a noise
appears in each pair. An example of a paired-comparison test function for
aircraft noise F-1 is given in the upper graph in Figure 5.
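
The 50 percent point described above can be sketched in Python. The report reads the point off a plotted curve; the sketch below substitutes straight-line interpolation between adjacent data points, which is an assumption, and the function names and the sample percentages are hypothetical.

```python
def level_at_fifty_percent(levels_db, percent_judged_worse):
    """Interpolate the level at which 50 percent of subjects judge the
    comparison noise the more unacceptable (linear interpolation assumed)."""
    points = sorted(zip(levels_db, percent_judged_worse))
    for (l0, p0), (l1, p1) in zip(points, points[1:]):
        crosses = (p0 - 50.0) * (p1 - 50.0) <= 0
        if crosses and p0 != p1:
            return l0 + (50.0 - p0) * (l1 - l0) / (p1 - p0)
    raise ValueError("the judgment curve does not cross 50 percent")


def subjective_equality_level(levels_db, pct_when_first, pct_when_second):
    """Average the 50 percent points for the two orders of presentation,
    as the text does to cancel the "time" error."""
    first = level_at_fifty_percent(levels_db, pct_when_first)
    second = level_at_fifty_percent(levels_db, pct_when_second)
    return (first + second) / 2


# Hypothetical percentages at four presentation levels of one comparison noise.
levels = [70, 75, 80, 85]
print(subjective_equality_level(levels, [10, 30, 70, 95], [20, 50, 80, 100]))
```

With these made-up data the two orders give 77.5 and 75.0 dB, so the level of subjective equality is taken as 76.25 dB.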

Fifty pairs of noises, requiring about 20 minutes, were presented to the
subjects in a single session, with a minimum of 10 minutes rest between
sessions. The pairs of noises were recorded with a 4 second pause between
pairs and about a 1 second pause between noises within a pair. Every 5th pair
was preceded by a pair number announcement (recorded on the test tape), and a
weak intensity beep tone separated the other pairs. The sequencing of
comparison noises and levels was randomized on the test tapes.
FIGURE 5. Examples of graphs developed from Paired-Comparison (upper) and
Magnitude-Estimation (lower) test results. Upper panel: example of
paired-comparison function; abscissa, Max PNL in dBA of aircraft noise F-1.
Magnitude Estimation Test. Each of the noises at each of the levels used
in the paired-comparison test was re-recorded in a random order onto a master
test tape with about 3 seconds between noises, with 81 noises on a given tape.
However, the first noise on each test tape was the standard pink noise S1,
presented at a maximum level of 80 dBA. The subjects were instructed to
ascribe the number 10 to the magnitude of relative unacceptableness of that
noise, and to judge each succeeding noise in relation to the standard and
assign an appropriate number to it; e.g., if the second sound appeared to be
twice as noisy or unacceptable, it was to be given the number 20. The specific
instructions appear in Appendix A.

The average, for all subjects of a particular group, of these magnitude
judgments for a given noise was then plotted against the level, as measured
by a given unit of physical noise measurement, at which the noise in question
was presented. The standard noise was presented at its specified level 5 times
during the course of 1 test tape of 81 noises, and the average numerical
magnitude of these five ratings was noted.

Using the average numerical magnitude given the standard, the graph prepared
for each of the other noises was then entered to determine the physical noise
level, in a particular unit of physical noise measurement, required for that
noise to achieve equal numerical magnitude of subjective noisiness. An example
of how the magnitude estimation data were interpreted is given in the lower
graph of Figure 5.
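
The step of "entering" the graph with the standard's average magnitude can be sketched as follows. The report does not say how the curve through the points was drawn; the sketch assumes a least-squares straight line of log10(magnitude) against level, and the function name and data are hypothetical.

```python
import math


def level_for_equal_magnitude(levels_db, mean_estimates, standard_estimate):
    """Return the level at which a noise would receive the same average
    magnitude estimate as the standard.

    Assumption: log10(mean estimate) is fitted as a straight line in level
    by ordinary least squares, and the line is then inverted.
    """
    xs = list(levels_db)
    ys = [math.log10(m) for m in mean_estimates]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return (math.log10(standard_estimate) - intercept) / slope


# Hypothetical noise rated 5, 10 and 20 at 70, 80 and 90 dB; the standard
# averaged a rating of 10, so equality falls at about 80 dB.
print(round(level_for_equal_magnitude([70, 80, 90], [5, 10, 20], 10.0), 2))
```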

Instructions to the Subjects. Two different sets of instructions were
prepared for both the paired-comparison test and the magnitude estimation
test, as shown in Appendix A. One set was relatively detailed and repetitious,
with the intent of making the subjects concentrate on and consider the whole
noise occurrence and not just the peak levels that occurred in each noise.

The appropriate instructions were repeated at the beginning of each rest
period between sessions. The second set of instructions was abbreviated as
much as thought possible and given but once at the start of each type of test
procedure.

Subjects. Three groups of 24 subjects each were selected. The subjects
consisted of 11 male and 25 female college students and 9 housewives, all of
whom reported they had no hearing difficulties. Group I was given the
paired-comparison test first and, about 2 weeks later, the magnitude-estimation
test. The "long" form of instructions was used with Group I as with Group II.
However, Group II received the magnitude-estimation test first and the
paired-comparison test second, about 1 week later. Group III received the
short form of the instructions, the paired-comparison test first, and the
magnitude-estimation test 2 days later.

RESULTS AND DISCUSSION 

Magnitude Estimation. The standard deviation (S.D.) statistic is used as a
means of evaluating the accuracy with which each of the units of physical
noise measurement predicted the magnitude estimation judgments made by the
three groups of subjects. Normal probability statistics have been commonly
used in the analysis of this type of judgment data (Refs. 1, 2, 3, 4, 5, 6, 7,
8; see particularly Ref. 2). The results are shown in Table 3.

These standard deviations are calculated by taking the square root of the
average of the sum of the squared differences between: (1) the average level,
as measured by a given physical unit, required of each of the noises in order
that they each be judged to have the same subjective magnitude; and (2) the
level of each individual noise when judged to have the same magnitude (see
Fig. 5). In formula, this is as follows:

\[ \mathrm{S.D.} = \sqrt{\frac{\sum (X - M)^2}{N - 1}} \]

where X is the level of an individual noise, M is the average level, and N is
the number of values entering the sum.

If there were perfect agreement between the physical unit of measurement
and the subjective judgments, the physical levels would, of course, have
identical values and the differences from the average would all be zero. This
statistic can be presumed to show the relative accuracy, in dB, with which the
different physical measures will predict subjective judgments of different
noises. For example, it is seen in Table 3 that 67% of the noises having Max
dBC levels that differed by as much as 10.90 dB could be judged as
subjectively equal (a plus or minus standard deviation of 5.45 dB would
encompass, in normal distributions, about 67% of all cases, or the noises);
this same percentage of the noises would be judged as equal when their Max
values were within a range of but 3.84 dB (plus or minus a standard deviation
of 1.92 dB).
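
The standard deviation and the plus-or-minus one S.D. ranges quoted above can be checked with a few lines of Python. This is a minimal sketch: the function names and the three sample levels are hypothetical, and only the 5.45 dB and 1.92 dB figures come from the text.

```python
import math


def sample_sd(levels_db):
    """S.D. with the N - 1 divisor, as in the formula given in the text."""
    n = len(levels_db)
    mean = sum(levels_db) / n
    return math.sqrt(sum((x - mean) ** 2 for x in levels_db) / (n - 1))


def one_sd_range(sd_db):
    """Width of the plus-or-minus one S.D. band, in dB."""
    return 2 * sd_db


# Hypothetical levels at which three noises were judged equal.
print(sample_sd([78.0, 80.0, 82.0]))

# Ranges quoted in the text for S.D.s of 5.45 dB and 1.92 dB.
print(round(one_sd_range(5.45), 2), round(one_sd_range(1.92), 2))
```

The second line reproduces the 10.90 dB and 3.84 dB ranges cited from Table 3.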

It is clear from Table 3 that the results for the three groups are
reasonably consistent with each other, and also that the Max PNL units of
physical measurement are at least as good as the EPNL units. This finding, and
a comparison of the relative accuracies of the different physical units of
measurement in predicting the subjective judgments of these noises, will be
discussed later, after presentation of the results of the paired-comparison
tests.

Paired-Comparison Test. Table 4 shows, for each of the groups, the standard
deviations of the distribution of differences between the level of each of
the standards, S1 and S2, and the level of each of the comparison noises when
judged to be equally unacceptable and when measured in terms of the various
physical units. Again, as with the magnitude estimation tests, the Max PNL
units exhibit standard deviations, or errors of prediction, that are at least
as small, on the average, as those for EPNL. Also, it is clear from Table 4
that there are no large apparent differences between the results found when
the longer duration (9 secs.) standard reference noise S2 was used and those
obtained when the comparisons were made against the shorter (4 secs.)
standard noise S1.
Table 4 - Part 1. PAIRED-COMPARISON TEST - MAX PNL

Table 4 - Part 2. PAIRED-COMPARISON TEST

Unit used by FAA-ISO and SAE in Aircraft Noise Evaluation Procedures

Predictive Accuracy of the Physical Units. The variable of spectrum
content of different noises has received the greatest research and engineering
attention for purposes of noise control. Table 5 is a summary table of how
well, in standard deviation terms, the various physical units of measurement
predicted the results of the subjective judgment tests. It might be noted
that the range of these standard errors is from about 2 dB for the best to
over 6 dB for the worst; these values are similar in magnitude to those found
in other well controlled comparative judgment tests that have been conducted
in the past (Ref. 4).

The results given in Table 5 are in substantial agreement with previous
experiments, in that the D2 overall frequency weighting and the PNdBM
third-octave band means of frequency weighting, in general, give better
predictions of the subjective judgments than do the other units of physical
measurement. Also, the tone corrections, t1 and t2, show some utility in this
regard; however, as usual with rather complex experiments of the sort
involved, the effects of the tone corrections are rather small.

Of special interest is, perhaps, the finding that the units of physical 
measurement PLdB and dBE recently proposed by Stevens (Ref. 5) do not predict 
the subjective value of the noises involved in these tests any better than 
does dBA on the average, having average standard deviations of 2.54 for PLdB, 
2.84 for dBE, and 2.78 for dBA. 

In interpreting these data, it should be borne in mind that, from a
strictly statistical point of view, a difference of about 0.25 to 0.50 dB
between two standard deviations of values of the order of 2.0 dB is
significant with the number of data points (a total of 126) in these tests.
According to the F test* of statistical significance, an F of about 1.27 with
an N, number of data points, of 126 would be significant at the 95% level of
confidence (Ref. 11). An F of 1.27 would be reached with a larger S.D. of 2.25
and a smaller S.D. of 2.00; a difference of 0.5 dB in the standard deviations
would be significant at a confidence level of 95% with an N of but about 42.
In the present study, combining the judgments of the 14 noises made by the 3
groups of subjects under three separate test conditions (the method of paired
comparison with two standards and the method of magnitude estimation) gives an
N of 126. N for comparison of one Group of subjects and one test condition is,
of course, 14; for one Group and two test conditions, N is 28, etc.

* $F = \mathrm{S.D.}^2_{\text{larger}} / \mathrm{S.D.}^2_{\text{smaller}}$ (Ref. 11)
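
The arithmetic behind these F values is easy to reproduce. The following Python sketch only computes the variance ratio and the count of data points; the function name `f_ratio` is hypothetical, and the critical values themselves are those the authors cite from Ref. 11, not something computed here.

```python
def f_ratio(sd_a, sd_b):
    """Ratio of the larger variance to the smaller one."""
    larger, smaller = max(sd_a, sd_b), min(sd_a, sd_b)
    return (larger / smaller) ** 2


# The example in the text: S.D.s of 2.25 and 2.00 dB give F of about 1.27.
print(round(f_ratio(2.25, 2.00), 2))

# A 0.5 dB difference on a 2.0 dB base gives a larger F.
print(round(f_ratio(2.50, 2.00), 2))

# Data points when everything is pooled: 14 noises x 3 groups x 3 conditions.
noises, groups, conditions = 14, 3, 3
print(noises * groups * conditions)
```

The same function reproduces entries of the F tables in this report, for example S.D.s of 2.20 and 1.75 dB give an F of about 1.58.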

From a more practical point of view, it could be argued (without success
among persons concerned primarily with overall environmental noise evaluation,
as compared to noise control at the source or in machinery design) that a
difference of about 0.50 dB in these standard errors should be considered
significant. The argument is based on the fact that, according to normal
probability statistics, it is reasonable to expect that populations of noises
of the types studied will be subjectively about the same when their measured
levels are within plus or minus three (± 3) standard deviations, a total range
of six standard deviations. Thus, the range of levels for noises of similar
subjective value would be of the order of, for Max PNL, 11.82 dB (S.D. of
1.97 x 6) for the D2 weighting and 14.70 dB (S.D. of 2.45 x 6) for the A
weighting. The increased error range of about 3 dBA over the dBD2 range of
expected error could be important to achieving valid design and noise control
goals in some cases, inasmuch as a difference of 3 dB is a matter of 100% in
sound power. This is equivalent, for example, to a doubling (or a halving) of
the number of engines on an aircraft for equal subjective effect.
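
The figures in this argument can be verified directly. The Python sketch below is illustrative only; the function names are hypothetical, while the S.D. values of 1.97 and 2.45 dB are taken from the text.

```python
def six_sd_range_db(sd_db):
    """Range of levels covered by plus or minus three standard deviations."""
    return 6 * sd_db


def power_ratio(delta_db):
    """Sound power ratio corresponding to a level difference in dB."""
    return 10 ** (delta_db / 10)


range_d2 = six_sd_range_db(1.97)   # Max PNL, D2 weighting
range_a = six_sd_range_db(2.45)    # Max PNL, A weighting
print(round(range_d2, 2), round(range_a, 2), round(range_a - range_d2, 2))

# A 3 dB difference is very nearly a factor of two in sound power.
print(round(power_ratio(3.0), 3))
```

The difference between the two ranges comes to 2.88 dB, which the text rounds to 3 dB.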

A more detailed statistical analysis of the Max PNL and EPNL values for
the D2 and A weighted sound levels of the noises is presented in Table 6. It
is seen in Table 6 that, over all conditions: Max dBD2 is significantly better
at the 90% level of confidence than Max dBA; EdBD2 is better than EdBA with a
more than 95% level of confidence; Max dBA is also more accurate than EdBA;
and Max dBD2 barely misses being statistically better at the 90% level of
confidence than EdBD2.

Table 6. F Tests of Statistical Significance of Differences Between dBD2 and
dBA Results for Max PNL and EPNL According to Methods of Paired Comparison and
Magnitude Estimation for all Three Groups of Subjects Combined. Results and
Standard Deviations (S.D.) of Distributions of Physical Measures of Noises
Judged to be Subjectively Equal in Noisiness.


Comparison of Max PNL with EPNL. One of the surprising things revealed
in Tables 3, 4, 5 and 6 is that the Max PNL units of physical measurement
predict so well the subjective judgments of noises that varied so much in
duration, from 4 secs. to over 18 secs. Indeed, Max PNL has an edge over EPNL.
Such a result is usually found when the judged noises are of comparable
durations (in which case the duration effect is more or less a constant), or
when the subjects seem to concern themselves solely with judging the noises
with respect to the peak levels reached by each of a variety of noises, rather
than judging how the longer duration noises might affect them in "real life."
It has been demonstrated by some laboratory tests, and in some field tests
conducted with actual aircraft flyovers, that noises of similar spectra but
longer duration are judged as less acceptable than the same noises when of
shorter duration.

It should perhaps be noted that, except for the two standards, S1 and S2,
the various noises that differed greatly in duration were also of considerably
different spectral type, and that the contribution of the "skirt" energy of
the noises that were of longer duration than the average would not be very
large. More important, however, is probably that, in spite of the instructions
to "judge the whole noise", the subjects placed heavier weight in their
judgments upon the "peak" levels of the noises as being, under the laboratory
circumstances, the most obvious aspect of the various noises that could
readily and reliably be comparatively judged and subjectively quantified. It
is hypothesized that the independent variations of such factors as duration,
spectral shape and complexity, and rates of growth and decay of the noise tend
to force subjects making judgments of a conglomeration of such differing
noises to attend primarily to such common features as the general spectral
content and peak level.


Differences Between Test Methods. In an earlier study (Ref. 1) it was
found that the methods of paired comparison and magnitude estimation gave
comparable results, both in terms of reliability and the general conclusions
regarding the predictiveness of various physical measures of the noise. The
present data, by and large, substantiate these findings, as shown in Table 7
for each group of subjects. In addition, as shown more succinctly in Table 8,
when the data for all three groups are combined, there are no significant
differences between the results obtained when either standard was used with
the method of paired comparison, or between the results obtained with the
method of paired comparison and the method of magnitude estimation.

Group Differences. Inasmuch as the subjects were assigned to the three
groups on essentially a random basis, consistent differences between the
results for the groups would presumably be attributable to the effects of
differences in the instructions to the groups and/or the order in which the
tests were administered. It is seen in Table 9 that the paired comparison
judgments made by Group I were generally less variable, and with some
statistical significance, than those made by Groups II and III, and that
Group II was slightly less consistent in its paired comparison judgments than
was Group III. However, the three groups performed about equally well in their
magnitude estimation judgments.

These relations are perhaps better illustrated in Table 10, where it is
seen that, except for Max PNL, Group II subjects were more variable in their
judgments, as predicted by the physical measurements of the noise, than were
the subjects in Groups I or III at either the 95% or 90% level of confidence.
Thus, it might be conjectured that the paired comparison test is adversely
influenced by previous experience with a magnitude estimation test, but that
the reverse is not true. It is more likely, however, that the general
increase in the size of the standard deviations for the Group II paired
Table 7. "F" Tests of Statistical Significance of Differences Between
Results, in Max dBD2 and EdBD2, for Methods of Paired Comparison and Magnitude
Estimation Tests for Each Group of Subjects Separately.

Paired Comparison Reference Standards
Reference Standard S1 vs Reference Standard S2

                       Max dBD2                     EdBD2
Group             I       II      III          I       II      III
SD, S1           1.75    2.41    1.15         1.70    2.95    2.49
SD, S2           1.64    3.05    1.99         1.79    3.35    2.93
F                1.14    1.60    2.99*        1.11    1.29    1.38

Magnitude Estimation vs Paired Comparison

Magnitude Estimation vs Reference Standard S1

                       Max dBD2                     EdBD2
Group             I       II      III          I       II      III
SD, ME           2.20    1.53    2.03         1.89    2.00    2.26
SD, S1           1.75    2.41    1.15         1.70    2.95    2.49
F                1.58    2.48**  3.12*        1.24    2.18**  1.21

Magnitude Estimation vs Reference Standard S2

                       Max dBD2                     EdBD2
Group             I       II      III          I       II      III
SD, ME           2.20    1.53    2.03         1.89    2.00    2.26
SD, S2           1.64    3.05    1.99         1.79    3.32    2.93
F                1.80    3.97*   1.04         1.11    2.76*   1.68

Magnitude Estimation vs Both Standards Combined

                       Max dBD2                     EdBD2
Group             I       II      III          I       II      III
SD, ME           2.20    1.53    2.03         1.89    2.00    2.26
SD, S1 and S2    2.10    2.86    2.19         2.07    3.27    2.98
F                1.10    3.49*   1.16         1.20    2.67    1.74

*  Significant at 95% level of confidence
** Significant at 90% level of confidence

Table 8

"F" Tests of Statistical Significance of Differences Between Results in
Max dBD2 and EdBD2 for Methods of Paired Comparison and Magnitude Estimation
Tests for all Three Groups of Subjects Combined.

Paired Comparison Reference Standards
Reference Standard 1 vs. Reference Standard 2

                 SD, S1      SD, S2          F
Max dBD2          2.19        2.75          1.58
EdBD2             2.76        3.09          1.25

Magnitude Estimation vs. Paired Comparison

M.E. vs. P.C. Reference Standard 1

                 SD, M.E.    SD, S1          F
Max dBD2          2.68        2.19          1.50
EdBD2             3.14        2.76          1.29

M.E. vs. Reference Standard 2

                 SD, M.E.    SD, S2          F
Max dBD2          2.68        2.75          1.05
EdBD2             3.14        3.09          1.03

M.E. vs. Both Standards Combined

                 SD, M.E.    SD, S1 and S2   F
Max dBD2          2.68        2.76          1.06
EdBD2             3.14        2.82          1.24



Table 9


"F" Tests of Statistical Significance Between Results in Max dBD2 and
EdBD2, of the Different Groups of Subjects for the Methods of Paired
Comparison and Magnitude Estimation.


Group I vs 

i Group II 

1.75 

2.41 

1 .£ 

** 

11 

1.70 

2.95 

* 

3.01 


|SD| 2.07 


<a EdBD SD 
S 2 


1.15 


** 


1.64 [ 1.19 

_ 1 . 47 

1.79 j 2.93 

** 

2 . 68 . 


2.10 2.19 


1.09 


2.19 


** 


3.27 

2.07 

2.98 

3.27 


' 2.07 


2.20 2.03 


*  Significant at 95% level of confidence
** Significant at 90% level of confidence

Table 10

"F" Tests of Statistical Significance of Differences Between Results, in
Max dBD2 and EdBD2, of the Different Groups of Subjects for Methods of Paired
Comparison, both Reference Standards, and Magnitude Estimation Combined.

               Group I vs Group II    Group I vs Group III    Group II vs Group III
Max dBD2  SD     2.47      2.62          2.47      2.12           2.62      2.12
          F           1.13                    1.36                     1.53**
EdBD2     SD     2.43      3.00          2.43      2.74           3.00      2.74
          F           1.52**                  1.27                     1.99*

Group I   - Long Form Instructions; P.C. Tests First, M.E. Second.
Group II  - Short Form Instructions; M.E. Tests First, P.C. Second.
Group III - Short Form Instructions; P.C. Tests First, M.E. Second.

*  Difference Significant at 95% Level of Confidence.
** Difference Significant at 90% Level of Confidence.

comparison tests, in comparison with the other results, is not significant 
and was due to some unidentified experimental error. In any event, the results 
obtained with Group I, who received the longer, more detailed instructions, 
were not significantly different from those found with Group III, who received 
the shorter set of instructions. 

Classes of Noise. One of the purposes of the present tests was to
determine whether or not commonly used physical measurements could be used as
well for the evaluation of so-called VSTOL-type noise as for fixed wing, jet
aircraft noise. The detailed data with respect to the physical units of dBA
and dBD2 for each of the noises evaluated are presented in Table 11A, B and C.
Casual examination of Table 11 appears to show no striking pattern between the
different types of noises and the proficiency with which their subjective
ratings were predicted by these physical units of noise measurement.
However, as shown in Table 12, grouping the noises by type does reveal that
possibly the subjective effect of the noise from the air-conditioner (AC) is
not predicted quite as well as are the effects of the aircraft noises. Because
only one air-conditioner noise was tested it is, of course, not possible to
consider this finding as reliable. It is to be noted in Table 12 that the
VSTOL noises and the noises from the typical jet aircraft are predicted by
the physical measures about equally well.

Magnitude Scale of Perceived Noisiness . A special feature of the magni- 
tude estimation test procedure is that a ratio scale of the subjective quan- 
tity (noisiness or unacceptability) is obtained from numerical values ascribed 
to the noises by the subjects, i.e., a noise given the number "twenty” pre- . 
sumably being twice as unacceptable to the listener as a noise given the 
numerical rating of 10. It has typically been found in the past that when 
judging non-impulsive noises, at least in the mid-to-high range of intensities, 
a 10 dB increase in the sound pressure level of the noise would cause a 
doubling in its subjective loudness or noisiness. 
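
The 10 dB doubling rule stated above can be written compactly. In this sketch the symbols are assumptions introduced for illustration: N_1 and N_2 denote the magnitude estimates given to two presentations of the same noise, and L_1 and L_2 their levels in dB.

```latex
\[
\frac{N_2}{N_1} = 2^{(L_2 - L_1)/10},
\qquad
L_2 - L_1 = 10 \log_2 \frac{N_2}{N_1}
\]
```

A 10 dB difference thus corresponds to a ratio of 2, and a 20 dB difference to a ratio of 4.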
Table 11A

Table 11B
Table 12

Average Levels in dB for the units dBA and dBD2 for 4 classes
of Noises Judged to be Equally Acceptable in the Paired-Comparison
and Magnitude-Estimation Tests.

                               MAX PNL                 EPNL
                          Max dBA   Max dBD2      EdBA     EdBD2

Av. VSTOL                  79.70     84.56        90.0     94.58
Av. Fixed Wing Jet         79.01     84.85        88.19    93.30
Air-Conditioner            80.11     84.84        91.61    96.61
Pink Noise                 77.64     84.16        86.84    93.93
Mean of all 14 Noises      79.14     84.70        88.83    94.10

Difference From Means of All 14 Noises

                               MAX PNL                 EPNL
                          Max dBA   Max dBD2      EdBA     EdBD2

Av. VSTOL                  +0.56     -0.14        +1.17    +0.48
Av. Fixed Wing Jet         -0.13     +0.15        -0.64    -0.8
Air-Conditioner            +0.97     +0.14        +2.78    +2.51
Pink Noise                 -1.5      -0.54        -1.99    -0.17


The results of magnitude estimation tests are tabulated for the highest 
and lowest levels of each noise in Table 13. Fig. 6 is a summary plot of the 
averages for all the noises when presented at three different levels of inten- 
sity. It is seen in Fig. 6 that the traditional doubling in the perceived 
noisiness occurs with a 10 dB increase in intensity. 
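
As a rough check on this statement, the average ratios shown with Fig. 6 can be compared with the ratios a strict 10 dB doubling rule would give. This is a minimal sketch, not the authors' analysis: the observed values are the "Average" row read from the Fig. 6 data, and the function names and the `db_per_doubling` parameter are hypothetical.

```python
import math

# Average ME ratios from the Fig. 6 data: level difference (dB) -> ratio.
OBSERVED = {8: 1.81, 10: 2.10, 16: 3.46}


def predicted_ratio(delta_db, db_per_doubling=10.0):
    """Ratio of magnitude estimates expected if noisiness doubles every
    `db_per_doubling` dB (10 dB is the traditional assumption)."""
    return 2.0 ** (delta_db / db_per_doubling)


def implied_db_per_doubling(delta_db, ratio):
    """Level increase per doubling implied by one observed ratio."""
    return delta_db / math.log2(ratio)


for delta_db, ratio in OBSERVED.items():
    print(
        f"{delta_db:2d} dB: observed {ratio:.2f}, "
        f"10 dB rule {predicted_ratio(delta_db):.2f}, "
        f"implied {implied_db_per_doubling(delta_db, ratio):.1f} dB per doubling"
    )
```

With these inputs the observed averages fall slightly above the strict 10 dB rule at all three level differences, which is roughly consistent with a doubling of perceived noisiness for an increase of about 10 dB.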
MAGNITUDE ESTIMATES, AND RATIOS, FOUND FOR NOISES
WHEN AT DIFFERENT LEVELS OF INTENSITY

Ratios Between ME's Given to Different Levels of Same Noise

                  dB Diff. Between Two Levels of Same Noise
                      8 dB       10 dB      16 dB

Group I               1.85       2.4        4.1
Group II              1.78       1.79       3.01
Group III             1.80       2.12       3.27
Average               1.81       2.10       3.46

[Figure 6 axes: ordinate, Ratios Between ME's Given to Different Levels of
Same Noise; abscissa, Difference Between Lowest Level and Higher Level of
Same Noise - dB]

FIGURE 6. Showing ratio of magnitude estimations of subjective noisiness or
unacceptableness of a noise as a function of difference in physical level of
two presentations of same noise.
CONCLUSIONS


On the basis of the present experimental results and related considera- 
tions presented in the discussion, it is concluded that: 

1. The standard deviations with which the various units of physical 
noise measurement predicted the subjective judgments of perceived noisiness or 
unwantedness were approximately the same for both the methods of magnitude 
estimation and paired comparison. 

2. The relative accuracy with which the physical units of measurement 
predicted the subjective judgments was as good for the maximum level reached 
by each noise as for so-called effective or time integrated levels of the 
noise measurements. This perhaps somewhat anomalous finding is ascribed to 
possible difficulties subjects have in such laboratory tests in attending 
simultaneously to more than one major and variable physical aspect of noises 
when several of these aspects are non-systematically varied among the noise 
stimuli. 

3. Statistically significant and often practically significant differ-
ences were found in the proficiency with which the different frequency weight-
ing procedures, both overall and one-third octave band, predicted the sub-
jective judgment test results. Frequency weightings and PNdBM were consis-
tently better by a small but probably a practically significant amount (about
3.5 dB over a range of ±3 standard deviations) than the overall frequency
weightings of A and E and the one-third octave band procedure of PLdB.

4. The physical units of dBA, dBD2 and PNdBM predicted the subjective
judgments of the VSTOL type aircraft noises as well as they predicted the
judgments of the noises from the typical fixed wing jet aircraft.

5. The scale of perceived noisiness as determined from the magnitude 
estimation tests was consistent with that typically found in the past for both 
loudness and perceived noisiness, namely, a doubling of perceived magnitude 
for each increase of 10 dB in intensity. 
BRIEF INSTRUCTIONS - PAIRED COMPARISON

You will hear a series of sounds from aircraft. The sounds will occur in 
"pairs" and your task is to judge which sound in each pair you think would be 
more unacceptable, bothersome or annoying to you. 

After you have heard each pair of sounds, please quickly decide which of 
the two you feel would be more unacceptable or annoying to you. If you think
the first sound of a pair to be more unacceptable, circle A for that particular 
pair. If you think the second sound in the pair would be more unacceptable to 
you than the first, circle B. If you feel that there is absolutely no real 
difference in terms of acceptability of the two sounds, please circle either 
A or B, giving the best guess you can. 

An announcement of the item number will be made before each 5th pair of 
sounds is to occur. The sounds of a pair will be separated by a few seconds. 
LONG INSTRUCTIONS - PAIRED COMPARISON


The primary purpose of the tests being conducted is to determine, if pos- 
sible, how people feel about the relative unacceptability or annoyingness of 
one type or level of aircraft noise when compared with a second type or level 
of aircraft noise. 

You will hear a series of sounds from aircraft. The sounds will occur in 
"pairs" and your task is to judge which sound in each pair you think would be 
more unacceptable, bothersome or annoying to you if heard in or near your home 
during the day and/or evening when you are engaged in typical, awake activities.
Judge how the whole, entire sound or noise would affect you - not just the peak
level but the whole noise from beginning to end as though you were in your home.

After you have heard each pair of sounds, please quickly decide which of 
the two you feel would be more unacceptable or annoying to you. If you think 
the first sound of a pair to be more unacceptable, circle A for that particular 
pair. If you think the second sound in the pair would be more unacceptable to 
you than the first, circle B. 

Please concentrate on the judgment at hand and give an answer even though 
the two sounds may seem approximately equal in acceptability or unacceptability 
to you. If you feel that there is absolutely no real difference in terms of 
acceptability of the two sounds, please circle either A or B, giving the best 
guess you can, and put a question mark after that pair. 

There are no "right" or "wrong" answers, nor do we expect people to agree 
with each other. We are interested in how you feel about the sounds and how
people differ in their judgments of these aircraft sounds in their entirety in 
or near your home. 

An announcement of the item number will be made before each 5th pair of 
sounds is to occur. The sounds of a pair will be separated by a few seconds. 
During the test period, which will be approximately 20 minutes, please remain 
quiet and attentive. Give us your best judgment and imagine, if you will, that 
you are listening to these sounds in or near your own home. 
LONG INSTRUCTIONS - MAGNITUDE ESTIMATION

The primary purpose of the tests being conducted is to determine, if pos- 
sible, how people feel about the relative unacceptability or annoyingness of 
different types of aircraft noise. 

You will hear a series of sounds from aircraft. Your task is to judge how 
unacceptable, bothersome or annoying each sound would be to you if heard in or 
near your home during the day and/or evening when you are engaged in typical,
awake activities. Judge how the whole, entire sound or noise would affect you -
not just the peak level but the whole noise from beginning to end as though you
were in your own home.

First, we will produce a sound whose noisiness score is 10. This will be 
the first sound after the announcement "begin test." Use that sound as a 
standard, and judge each succeeding sound in relation to that standard. For 
example, if a sound seems twice as noisy or annoying or unacceptable as the 
standard, you will write 20 in the appropriate box on the answer sheet. If it 
seems only one-quarter as noisy, write 2.5. If it seems three times as noisy, 
write 30; one-half as noisy, write 5, and so on.

Please concentrate on the judgment at hand and give an answer that tells 
how strong the annoyance seems to you. There are no "right" or "wrong" answers,
nor do we expect people to agree with each other. We are interested in how you 
feel about the sounds and how people differ in their judgments of these air- 
craft sounds in their entirety in or near your home. 

An announcement of the item number will be made before each 5th sound.
The sounds will be separated by a few seconds. During the test period, which
will be approximately 20 minutes, please remain quiet and attentive. Give us
your best judgment and imagine, if you will, that you are listening to these
sounds in or near your own home.


NASA-Langley, 1974 CR-2471